We are going to look at the Instacart dataset.
library(tidyverse)
library(p8105.datasets)
library(plotly)
Because the full dataset is large, we will analyze a random sample of 5,000 rows.
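# Load the full dataset first; calling data() after sampling would
# overwrite the sample with the full dataset again.
data("instacart")
# Fix the seed so the random sample is reproducible.
set.seed(1234)
instacart =
  instacart %>%
  sample_n(5000) %>%
  select(order_id,
         product_id,
         user_id,
         order_hour_of_day,
         days_since_prior_order,
         product_name,
         aisle,
         department)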
Across all departments, at what hour of the day do customers usually place an order?
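# Axis settings for the y-axis (hour of the day the order was placed).
y_axis = list(
  title = "Hour of the day"
)
# One box per department showing the distribution of order hours.
instacart %>%
  plot_ly(x = ~department, y = ~order_hour_of_day, type = "box") %>%
  layout(yaxis = y_axis)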
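The box plot can be paired with a numeric summary of the same quantities. The following is a minimal sketch, not part of the original analysis: `toy_orders` is a small hypothetical data frame standing in for the sampled `instacart` data, and the summary column names (`n_orders`, `q1`, `median_hour`, `q3`) are chosen here for illustration.

```r
library(dplyr)

# Hypothetical toy data standing in for the sampled instacart data frame;
# only the two columns used by the box plot are included.
toy_orders = data.frame(
  department = c("produce", "produce", "produce",
                 "dairy eggs", "dairy eggs", "dairy eggs"),
  order_hour_of_day = c(9, 14, 16, 8, 10, 20)
)

# Per-department count, quartiles, and median of the order hour,
# i.e. the numbers each box in the plot is drawn from.
hour_summary =
  toy_orders %>%
  group_by(department) %>%
  summarize(
    n_orders = n(),
    q1 = quantile(order_hour_of_day, 0.25),
    median_hour = median(order_hour_of_day),
    q3 = quantile(order_hour_of_day, 0.75)
  )

hour_summary
```

Replacing `toy_orders` with the sampled `instacart` data should give one row per department, which makes it easier to compare medians that look similar in the plot.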